By 2015 the advanced versions of the gravitational-wave detectors Virgo and LIGO will be online.
They will collect data in coincidence with enough sensitivity to potentially deliver multiple
detections of gravitational waves from inspirals of compact-object binaries. This work is focused on
understanding the effects introduced by uncertainties in the calibration of the interferometers. We 
consider plausible calibration errors based on estimates obtained during LIGO's fifth and Virgo's 
third science runs, which include frequency-dependent amplitude errors of ~ 10% and
frequency-dependent phase errors of ~ 3 degrees in each instrument. We quantify the consequences of such
errors for estimating the parameters of inspiraling binaries. We find that the systematics introduced
by calibration errors on the inferred values of the chirp mass and mass ratio are smaller than 20% 
of the statistical measurement uncertainties in parameter estimation for 90% of signals in our mock 
catalog. Meanwhile, the calibration-induced systematics in the inferred sky location of the signal 
are smaller than ~ 50% of the statistical uncertainty. We thus conclude that calibration-induced 
errors at this level are not a significant detriment to accurate parameter estimation. 



I. INTRODUCTION 

The detection of gravitational waves (GWs) will
give us empirical access to the genuinely strong-field
dynamics of space-time and allow us to probe
astrophysical phenomena inaccessible through
electromagnetic observations alone. Despite indirect evidence, like
the shrinking of the orbit in the Hulse-Taylor binary,
which is in excellent agreement with the theoretical
calculation [1], a direct detection of GWs is yet to
occur. The gravitational-wave detectors based on
interferometry, namely the two LIGO instruments [2], Virgo [3-6]
and GEO600 [7, 8], collected data in coincidence
through October 2010. The most recent published
results [9, 10], which cover the period 4 November 2005
- 30 September 2007, do not claim detections. The
LIGO instruments and Virgo will undergo major
improvements in the next few years, and will begin
collecting data again by 2015, with an improved
sensitivity [3, 6] that may allow for frequent detections [11],
ushering in the so-called advanced detector era.

Apart from the intrinsic scientific importance of a 
first direct detection, the advanced versions of the in- 
struments will open a new era of astronomy and cos- 
mology, in which GWs will be used to test the strong- 
field regime of General Relativity [T3J UMTS] ; to set 
better bounds for the values of the cosmological pa- 
rameters pH ITM251 145] ; to check the validity of the 
equations of state for neutron stars [26 ; to probe the 
astrophysics of binary evolution jJJ]; etc. 

In order to extract as much physical information 
as possible, all the known sources of error must be 
eliminated, reduced or quantified. Among the known 
sources of errors, there are calibration errors, i.e.
errors on the measurement of the transfer function,
which converts the readout of the instruments to the 
strain used for data analysis. 

These errors will have consequences for the estima- 
tion of the intrinsic and extrinsic parameters of the 
source of GWs, as the data analyst will infer an
incorrect data stream. Some previous works have dealt
with calibration errors, in the context of detection
efficiency using template banks [38] and parameter
estimation [66], but a complete treatment requires the
use of numerical methods, because the high dimen- 
sionality of the problem and the correlations between 
the unknown parameters on which the GWs depend 
make it impossible to forecast the exact effects of cal- 
ibration errors analytically. 

In this article we have used a Bayesian approach 
to study and quantify these effects for the first time 
in the literature. We created catalogs of 250 software 
injections (i.e. signals of known shape added to syn- 
thetic noise) in each of three mass bins: one for bi- 
nary neutron star systems, one for binary black holes, 
and one for neutron star-black hole systems. We have 
generated ten different sets of calibration error curves, 
with shapes and magnitudes that should be represen- 
tative of the errors we expect to have in the advanced 
detector era. 

The catalogs of injections were analyzed twice: first, 
by running a Bayesian parameter-estimation code [67-70]
on the original injections, and then by running the
same code after artificially adding, one at a time, the 
calibration errors we had generated. As the presence 
of the errors was the only thing that had changed 
between one analysis and the other, the differences 
observed in the recovered parameters and the Bayes 
factors could only have been caused by the calibration
errors, and we were able to quantify these differences
and relate them to the calibration errors. 

We have found that the effects are generally small,
the shifts introduced in the estimated parameters
being a fraction of the statistical measurement errors due
to the noise in the instruments. At the same time, the
Bayes factors of the signals are only slightly affected
by the errors we have considered, the average shift
being ~ 0.9%, so that if the Bayes factor were used as a
detection statistic, in the way described in [57], no
signals would be missed because of the way the
errors have changed their shapes.

This article is organized as follows. In Sec. II
we describe the interferometers and the process of
calibration. In Sec. III we describe the errors
associated with the calibration process, and how we
model them. In Sec. IV we give some details about
the Bayesian approach to parameter estimation and
model selection, with specific focus on
gravitational-wave data analysis. In Sec. V we describe the
method we have used to quantify the effects of
calibration errors, and in Sec. VI we report the main
results of our analysis.



II. CALIBRATION TECHNIQUES 

Ground-based laser interferometric gravitational 
wave detectors operate in a Michelson interferome- 
ter type configuration, measuring the phase propaga- 
tion difference between two perpendicular arms with 
a phase accuracy of $\lambda/10^{12}$ ($\lambda$ being the wavelength of
the laser). In LIGO and Virgo, this is accomplished 
by enhancing the GW induced phase changes using 
4 km long Fabry-Perot resonators in each of the in- 
terferometer arms, optimizing the integration time of 
the detector to GWs of a few hundreds of hertz. In 
order to analyze the effects of calibration errors on 
parameter estimation, as we seek to do in this arti- 
cle, we abstract the incredibly complex interferometer 
to a single degree of freedom sensor, only sensitive to 
differential arm length (DARM) changes, which are 
expected to contain the gravitational wave signals. In 
order to operate such a sensor in a continuous fashion, 
the DARM signals are measured in closed loop feed- 
back, correcting the measured deviations and keeping 
the interferometer at the desired operating point. A 
reduced block schematic of the feedback loop involved 
is shown in Fig. 1.

Figure 1. A schematic representation of the IFO with the
subsystems described in the text.

The schematic immediately indicates some kind of
'in-loop' measurement, where any disturbance is
suppressed by the control loop, leaving the interferometer
output dependent on the performance of the feedback.
In order to reconstruct the actual GW signal, we
require accurate knowledge (transfer functions) of all
components within the feedback loop. It is the
uncertainty in the overall loop transfer function that
provides us with an error on the calibration of our
gravitational wave detector. The sensing method used
provides the differential phase measurement at the
output of the interferometer and is based on the
Pound-Drever-Hall (PDH) technique [37J[2H]. Within the
necessary bandwidth, the PDH technique provides a
signal, $e(f)$, also called the error signal, that is
proportional to the measured deviation. With reference to
Fig. 1 we see that the external length perturbations,
$\Delta L_{\rm ext}$, transfer to the error signal by



$$\Delta L_{\rm ext}(f) = R(t,f)\,e(f), \tag{2.1}$$



where $e(f)$ is the error signal output coming from the
interferometer and $R(t,f)$ is the frequency-dependent
response of the closed-loop feedback control system
(the time dependence being there to recall that the
behavior of the instrument changes with time, see
below). Within the interferometer calibration
nomenclature, $R(t,f)$ is usually referred to as the length
response function and completely describes the transfer
function between the residual change in DARM and
the digital error signal. The calibration of
gravitational wave detectors is an entire study unto itself and
much is involved in extracting an accurate response for
different components within the feedback loop.
Evaluating the blocks in Fig. 1 shows that calibration of
the detector output involves three main subsystems.
The uncertainty in each subsystem's transfer
function is a source of calibration errors. The
subsystems are:

• The transfer function of the arm cavity $C'(t,f)$,
which is also known as the sensing function and
can be split into a complex frequency-dependent
part and a slowly varying time-dependent part:
$C'(t,f) = C(f)\,a(t)$.

• The digital filter D(f) is applied to the
measured error signal and 'shapes' the feedback loop
response time and the amount of disturbance
rejection from external noise.

• The actuation function A(f) transfers the 
'knowledge' of the filtered error signal into a 
physical correction force on the interferometer. 
This can be, for example, the force exerted by 
a voice coil onto the test masses in the interfer- 
ometer arms. 

We can set up a set of self-consistent equations that
describes the behavior of the closed-loop system. With
reference to the variables in Fig. 1 these are

$$\Delta L_{\rm res} = \Delta L_{\rm ext} - x, \tag{2.2}$$
$$e(f) = a(t)\,C(f)\,\Delta L_{\rm res}, \tag{2.3}$$
$$dc(f) = e(f)\,D(f), \tag{2.4}$$
$$x = A(f)\,dc(f). \tag{2.5}$$

Rearranging Eqs. (2.2) to (2.5), one can find, after
some algebra, the explicit expression for the length
response function $R(t,f)$:

$$R(t,f) = \frac{1 + a(t)\,G(f)}{a(t)\,C(f)}, \tag{2.6}$$

where we introduced the loop gain function, G(f) = 
A(f)C(f)D(f), also known as the open loop gain of 
the system. The loop gain G(f) of the feedback sys- 
tem is obtained by breaking the loop at an arbitrary 
point and multiplying all subsystems by going round 
the loop once. When analyzing the performance of 
our gravitational wave sensor it is useful to create a 
measurement error budget. For the analysis of cal- 
ibration errors, the error budget describes the noise 
sources introduced by the various subsystems in the 
feedback loop. In general, the individual noise contri- 
butions are either directly measured or inferred, using 
different methods. In particular, these methods are: 

• The time-dependent part of the sensing function 
is measured by injecting digital signals of known 
shape, prior to the actuation. 

• The calibration of the actuation function usually 
yields the largest source of errors. Until the fifth 
LIGO science run, the main method to mea- 
sure the actuation function was the so-called 
free-swinging Michelson technique. Recently, a 
new method, called photon calibrator (PCal) 
has been introduced; it uses a laser to push the 
end mirrors with a known radiation pressure. 

• The digital filters D(f) are very well known 
functions to which we do not assign errors. 
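
The loop relations (2.2)-(2.5) and the length response function (2.6) can be checked against each other numerically. The following is a minimal sketch at a single frequency; the numerical values chosen for a(t), C(f), D(f) and A(f) are hypothetical and are not taken from any instrument, and the function names are ours.

```python
import cmath

def loop_response(a, C, D, A):
    """Length response R = (1 + a*G) / (a*C), with loop gain G = A*C*D (Eq. 2.6)."""
    G = A * C * D
    return (1 + a * G) / (a * C)

def error_signal(dL_ext, a, C, D, A):
    """Solve the loop equations (2.2)-(2.5) for the error signal e.

    e = a*C*(dL_ext - x) with x = A*D*e, hence e = a*C*dL_ext / (1 + a*A*C*D).
    """
    return a * C * dL_ext / (1 + a * A * C * D)

# Hypothetical complex transfer-function values at one frequency.
a = 0.98                        # slowly varying time-dependent factor a(t)
C = 2.0 * cmath.exp(-0.3j)      # sensing function C(f)
D = 5.0 * cmath.exp(0.1j)       # digital filter D(f)
A = 0.4 * cmath.exp(-1.2j)      # actuation function A(f)
dL_ext = 1e-18 + 0j             # external length perturbation

e = error_signal(dL_ext, a, C, D, A)
R = loop_response(a, C, D, A)
reconstructed = R * e           # Eq. (2.1): should give back dL_ext
```

In this toy setting the product R e reproduces the external perturbation up to floating-point rounding, which is the content of Eq. (2.1).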

For a full treatment of different gravitational wave
interferometer calibration techniques, and the errors
related to them, see [29, 31-33, 36, 37]. Note that
the time-dependent part of $R(t,f)$ is slowly varying,
with time scales on the order of days, while the
typical signals of interest to us occur on time scales of
several minutes (footnote 1). We will therefore commit a
slight abuse of notation and write $R(t,f) = R(f)$,
assigning the measurement errors associated with the
time dependence of the length response function to
the measurement of $a(t)$.

The transfer function R(f) is a complex function.
Hence, we can write it in polar form:

$$R(f) = A(f)\,e^{i\phi(f)}. \tag{2.7}$$



Once the transfer function is known, the DARM can
be calculated directly using Eq. (2.1), from which the
strain follows immediately:

$$d(f) = \frac{\Delta L_{\rm ext}(f)}{L}, \tag{2.8}$$

where $L$ is the arm length of the IFO in the absence
of external perturbations.



III. CALIBRATION ERRORS 

The calibration procedures are not free from sys- 
tematic effects. In general the transfer function will 
not be known with arbitrary precision, but it will be 
different from the "exact" one. These differences will 
be present both in amplitude and in phase: 



$$R_m(f) = [A + \delta A]\,e^{i(\phi + \delta\phi)} = \left[1 + \frac{\delta A}{A}\right] e^{i\,\delta\phi}\,A\,e^{i\phi}. \tag{3.1}$$


Henceforth we will use an index e to denote the 
exact length function, and all the quantities that are 
built from it, and an index m to denote quantities 
which are measured, and hence affected by
calibration errors (CEs). The errors are usually reported as
relative errors for the amplitude, $\delta A/A$, and as
absolute errors for the phase (in radians or degrees).

In the scenario where calibration errors are present 
and not negligible, the experimenter will be using the 
measured transfer function $R_m(f)$ and not the correct
one, therefore the inferred values for the DARM and 
data stream will also be different from their true value. 



Footnote 1: There are other kinds of longer signals, which are scientifically
interesting (e.g. stochastic background, pulsar signals), but
they are not considered in this work.

From Eqs. (2.1), (2.8) and (3.1):

$$d_m(f) = \frac{R_m(f)\,e(f)}{L} = K(f)\,d_e(f), \tag{3.2}$$



where, in order not to burden the formulae, we have
introduced a function $K(f)$ that conveys the errors
for both phase and amplitude:

$$K(f) = \left[1 + \frac{\delta A(f)}{A(f)}\right] e^{i\,\delta\phi(f)}. \tag{3.3}$$



When a GW signal s(f) and noise n(f) are present 
in the data, they will be affected by the errors in the 
same way: 

$$d_m(f) = n_m(f) + s_m(f) = K(f)\,d_e(f) = K(f)\,[n_e(f) + s_e(f)], \tag{3.4}$$

which straightforwardly gives:

$$s_m(f) = K(f)\,s_e(f), \tag{3.5}$$
$$n_m(f) = K(f)\,n_e(f). \tag{3.6}$$



Note that the errors do not affect what is really 
happening in the IFO, which is the error signal, but 
only the way in which this quantity is interpreted by 
the observers in terms of data stream. 
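
The action of K(f) on a frequency-domain data stream, Eqs. (3.2)-(3.6), can be sketched in a few lines. This is an illustration only: the three-bin data vector is made up, and the error values (10% in amplitude, 3 degrees in phase) are simply the magnitudes quoted in the abstract, applied here as constants across bins.

```python
import cmath
import math

def calibration_factor(rel_amp_err, phase_err_rad):
    """K(f) = [1 + dA/A] * exp(i * dphi) at one frequency, Eq. (3.3)."""
    return (1 + rel_amp_err) * cmath.exp(1j * phase_err_rad)

def miscalibrate(d_exact, rel_amp_errs, phase_errs_rad):
    """Apply d_m(f) = K(f) * d_e(f), Eq. (3.2), bin by bin."""
    return [calibration_factor(da, dp) * d
            for d, da, dp in zip(d_exact, rel_amp_errs, phase_errs_rad)]

# Hypothetical three-bin data stream with constant errors in every bin.
d_e = [1 + 1j, 0.5 - 0.2j, -0.3 + 0.8j]
d_m = miscalibrate(d_e, [0.10] * 3, [math.radians(3.0)] * 3)
```

Because the same factor multiplies signal and noise, the ratio d_m/d_e in each bin has modulus 1 + dA/A and argument dphi, independently of what the bin contains.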

The effects of CEs on detection statistics, and SNR,
have already been the subject of work by several
groups. It is known that CEs do not affect the optimal
SNR [45]. This is easily verified starting from the
definition of the optimal SNR $\rho$:

$$\rho^2 = 4\int_0^\infty \mathrm{d}f\,\frac{|s(f)|^2}{S(f)}, \tag{3.7}$$



where we have introduced the one-sided noise spectral 
density (PSD) S(f), which is the Fourier transform of 
the noise autocorrelation function. There are several 
equivalent definitions for this quantity. The one we 
find the most useful is (see 07]): 



$$\delta(f - f')\,S(f) = 2\,\langle n(f)\,n^*(f')\rangle, \tag{3.8}$$



where $\langle\,\cdot\,\rangle$ indicates an average over an ensemble
of noise realizations. We can easily infer the effect of
CEs on the noise PSD, using Eq. (3.6):

$$S_m(f) \propto \langle n_m(f)\,n_m^*(f)\rangle = \left[1 + \frac{\delta A(f)}{A(f)}\right]^2 S_e(f), \tag{3.9}$$



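which shows that only amplitude errors affect the noise
PSD. From Eqs. (3.7) and (3.9) the invariance of the
optimal SNR follows nearly immediately:

$$\rho_m^2 = 4\int_0^\infty \mathrm{d}f\,\frac{s_m(f)\,s_m^*(f)}{S_m(f)}
= 4\int_0^\infty \mathrm{d}f\,\frac{s_e(f)\,s_e^*(f)\,[1 + \delta A/A]^2}{S_e(f)\,[1 + \delta A/A]^2}
= 4\int_0^\infty \mathrm{d}f\,\frac{s_e(f)\,s_e^*(f)}{S_e(f)} = \rho_e^2. \tag{3.10}$$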



On the other hand, CEs do affect the actual SNR
recovered by detection pipelines. In Ref. [43] it was
theoretically calculated that the effect of CEs on the
recovered SNR is of second order, for small errors.
This fact was then verified experimentally, using
hardware injections, during the first science run of the
LIGO instruments [55], finding that the recovered
SNR depended quadratically on the time-dependent
part of the sensing function, $a(t)$.
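
The invariance of the optimal SNR under calibration errors, Eq. (3.10), can be verified numerically. The sketch below is a toy check, not part of the analysis pipeline: the signal, the PSD and the frequency-dependent error curves are all hypothetical shapes chosen for illustration, and the integral is replaced by a simple Riemann sum.

```python
import cmath
import math

def optimal_snr_sq(s, S, df):
    """Riemann-sum approximation of rho^2 = 4 * int |s(f)|^2 / S(f) df."""
    return 4.0 * df * sum(abs(si) ** 2 / Si for si, Si in zip(s, S))

df = 1.0
freqs = [20.0 + i * df for i in range(200)]

# Toy "exact" signal and one-sided PSD (hypothetical shapes).
s_e = [f ** (-7.0 / 6.0) * cmath.exp(1j * 0.01 * f) for f in freqs]
S_e = [1.0 + (50.0 / f) ** 4 for f in freqs]

# Hypothetical frequency-dependent errors: up to 10% in amplitude, 3 degrees in phase.
dA = [0.10 * math.sin(f / 30.0) for f in freqs]
dphi = [math.radians(3.0) * math.cos(f / 40.0) for f in freqs]

# Eq. (3.5) for the signal and Eq. (3.9) for the PSD.
s_m = [(1 + a) * cmath.exp(1j * p) * s for a, p, s in zip(dA, dphi, s_e)]
S_m = [(1 + a) ** 2 * S for a, S in zip(dA, S_e)]

rho_e_sq = optimal_snr_sq(s_e, S_e, df)
rho_m_sq = optimal_snr_sq(s_m, S_m, df)
```

The two values agree to rounding error because the amplitude factor cancels bin by bin between numerator and denominator, while the phase factor drops out of |s(f)|^2.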

Theoretical approaches to the effects of CEs on
signal detection and template bank searches have been
pursued in Refs. [38-40]. In [55] these studies were
extended to include the effects on parameter estimation
for various kinds of signals. A theoretical study that
makes use of Bayesian analysis is being performed by
one of the authors [42].

Without going into details, it seems clear that cal- 
ibration errors have the potential to impact the mea- 
surement of all of the source parameters - masses, 
sky location, distance, inclination and orientation - 
because of the complicated correlations that exist be- 
tween these parameters. Therefore, precisely evaluat- 
ing the impact of calibration errors requires a careful 
numerical analysis that coherently fits all parameters 
simultaneously, and this is the analysis we present in 
subsequent sections. 

Here we rely on approximations to crudely estimate 
the most significant biases due to possible calibration 
errors. The intrinsic parameters (the two component 
masses, and potentially spins, though we do not con- 
sider these here) leave a very strong signature on the 
phase evolution of the gravitational waveform, and are 
primarily measured through phase rather than ampli- 
tude information. The sky location can be estimated 
by timing triangulation between the arrival times of 
the GW signal at different detectors. The inclination 
and orientation angles are functions of the relative sig- 
nal amplitudes and phase shifts at the detectors, while 
the distance is given by the overall signal amplitude 
once other parameters are known. These angles and 
distance are strongly correlated with each other, but 
relatively weakly correlated with the intrinsic param- 
eters. 

Calibration errors can be divided into three
types: timing errors, amplitude errors and
frequency-dependent phase errors, and one can estimate the
permissible ranges on the three error types subject to the
condition that systematic biases must remain below
statistical measurement uncertainties.

1. Timing errors. These primarily affect sky local- 
ization by influencing timing triangulation, and 
can be seen as a special case of phase errors de- 
scribed below (phase errors with linear depen- 
dence on the frequency). A source can be timed 
to an $O(1/\mathrm{SNR})$ fraction of a wave cycle, with the
best timing happening at the "bucket" of the 
noise spectrum, around 100 Hz. Thus, we may 
expect timing accuracies of order a millisecond. 
Meanwhile, the typical baseline (separation be- 
tween detectors) is of order 10 milliseconds of 
light travel time, leading to statistical measure- 
ment uncertainties of order 10 degrees for a pair 
of detectors. Timing errors will, therefore, be- 
come significant relative to measurement uncer- 
tainties only if they constitute a significant frac- 
tion of a millisecond, and calibration-induced bi- 
ases should be negligible for timing errors of less 
than ~ 0.1 ms. (Note, however, that measurement
errors improve with more detectors, so an
expansion of the detector network will tighten the
constraints on timing errors.) In this work we
will not consider this kind of error, as the actual
timing errors measured by the calibration
teams j^Hl ISD E2] are much smaller than the val- 
ues which might lead to large biases. 

2. Amplitude errors. If constant amplitude errors
lead to a fixed scaling of the measured amplitude
in all detectors, they would only affect
the distance estimate and none of the other
parameters. Distances are not particularly well
measured by GW networks, with typical fractional
uncertainties of perhaps 300/SNR %, so
for an individual source, amplitude calibration
errors of under 20% should not lead to dominant
systematic errors, except for the loudest events (footnote 2).
Of course, amplitude calibration errors will not
be identical in the various detectors, so inclination
and orientation will be affected along with
distance, but due to the difficulty of measuring
these parameters precisely, similar constraints
apply. Frequency-dependent amplitude errors
should not significantly influence parameter
estimation for nonspinning signals, since estimates
will primarily be sensitive to a (noise-weighted)
average amplitude; however, spin measurements
are sensitive to modulations of signal amplitude
which could mimic the effects of orbital precession,
hence such errors could cause more problems
if spin parameters are also being estimated.

3. Frequency-dependent phase errors.
Frequency-dependent phase errors are, perhaps, the most
dangerous of all, since they can influence the
measurements of the binary's intrinsic parameters.
Such errors can mimic the effects of different
post-Newtonian corrections to the phase
evolution, leading to systematic biases in the
measurements of the masses. However, these
phase errors are localized in frequency and do
not accumulate over the inspiral. Therefore,
sensitivity to these errors is limited by the overall
measurement uncertainty on the waveform
phase, which is expected to be on the order of
1/SNR of a cycle at the bucket, and worse elsewhere.
Therefore, frequency-dependent phase
errors of less than ~ 10-20 degrees should not
lead to significant biases for all but the strongest
signals.

Footnote 2: It is worth pointing out, however, that if amplitude
calibration errors stay constant over the run, these distance biases
would be constant, unlike the randomly fluctuating measurement
uncertainties, so they could have a pernicious effect
on analyses that combine observations of multiple sources to
study cosmology [24, 25].

The rest of the paper is dedicated to the system- 
atic study in the context of Bayesian inference of the 
combined effects of phase and amplitude calibration 
errors on parameter estimation for GW signals emitted
during the inspiral of compact binary systems
whose components are not spinning. 
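
The order-of-magnitude estimates given above for the three error types can be collected in a short calculation. This is a rough sketch that only restates the scalings quoted in the text (timing to 1/SNR of a cycle at about 100 Hz, a baseline of about 10 ms of light travel time, distance uncertainty of 300/SNR percent, phase uncertainty of 1/SNR of a cycle); the function name, the default values and the small-angle conversion of the timing ratio into degrees are our own simplifying assumptions.

```python
import math

def rough_tolerances(snr, f_bucket_hz=100.0, baseline_light_ms=10.0):
    """Back-of-the-envelope statistical uncertainties for a single source."""
    cycle_ms = 1e3 / f_bucket_hz                  # duration of one wave cycle at the bucket
    timing_ms = cycle_ms / snr                    # O(1/SNR) fraction of a cycle
    # Small-angle estimate: timing error over baseline, read as radians.
    sky_deg = math.degrees(timing_ms / baseline_light_ms)
    distance_pct = 300.0 / snr                    # fractional distance uncertainty, in percent
    phase_deg = 360.0 / snr                       # 1/SNR of a cycle, in degrees
    return {
        "timing_ms": timing_ms,
        "sky_deg": sky_deg,
        "distance_pct": distance_pct,
        "phase_deg": phase_deg,
    }

est = rough_tolerances(snr=10.0)
```

For an SNR of 10 this gives a timing accuracy of about a millisecond and a sky uncertainty of several degrees, in line with the figures quoted in item 1, and shows why the calibration errors considered here sit below the statistical uncertainties for all but loud signals.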



IV. BAYESIAN MODEL SELECTION AND 
PARAMETER ESTIMATION 

An excellent introduction to Bayesian model selection,
and its application to GW detection and parameter
estimation, can be found in [67]. In this section
we will only summarize the main results and
nomenclature we will use in the remainder of this work.

Given a set of data $d$ and some prior information
$I$, the probability for a model (or hypothesis) $H_i$ is
given by Bayes' theorem:

$$P(H_i|d,I) = \frac{P(H_i|I)\,P(d|H_i,I)}{P(d|I)}, \tag{4.1}$$

where $P(H_i|I)$ is the prior probability for the
hypothesis $H_i$, and $P(d|H_i,I)$ is the probability of
the data given that the hypothesis $H_i$ is true, also
called the likelihood of the data. The factor in the
denominator, $P(d|I)$, is the marginal probability for
the data, marginalized over the different hypotheses or
models.

Without enumerating all the different models, we
can calculate the relative weight between two of them
(the odds ratio), using Eq. (4.1). More precisely, the
odds ratio of a model $H_i$ and a model $H_j$ is:

$$O_{ij} = \frac{P(H_i|d,I)}{P(H_j|d,I)} = \frac{P(H_i|I)}{P(H_j|I)}\,\frac{P(d|H_i,I)}{P(d|H_j,I)} = \frac{P(H_i|I)}{P(H_j|I)}\,B_{ij}, \tag{4.2}$$

where we have introduced the Bayes factor $B_{ij}$, or
ratio of likelihoods, between model $H_i$ and model
$H_j$. Note that the marginal probability for the data,
$P(d|I)$, cancels out when the ratio is calculated.
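
Since likelihoods of GW data span many orders of magnitude, the odds ratio of Eq. (4.2) is in practice best evaluated from logarithms. The following minimal sketch does this for two models; the function names and the numerical log-likelihood values are hypothetical.

```python
import math

def log_bayes_factor(log_like_i, log_like_j):
    """log B_ij = log P(d|H_i,I) - log P(d|H_j,I)."""
    return log_like_i - log_like_j

def odds_ratio(prior_i, prior_j, log_like_i, log_like_j):
    """Odds ratio of Eq. (4.2): prior odds times the Bayes factor."""
    return (prior_i / prior_j) * math.exp(log_bayes_factor(log_like_i, log_like_j))

def posterior_prob_of_i(prior_i, prior_j, log_like_i, log_like_j):
    """P(H_i|d,I) when H_i and H_j are the only two models considered."""
    o = odds_ratio(prior_i, prior_j, log_like_i, log_like_j)
    return o / (1.0 + o)

# Hypothetical numbers: equal priors, log-likelihoods differing by 3.
o_ij = odds_ratio(0.5, 0.5, -100.0, -103.0)
p_i = posterior_prob_of_i(0.5, 0.5, -100.0, -103.0)
```

With equal priors the odds ratio reduces to the Bayes factor, and P(d|I) never needs to be computed, as noted above.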

In a typical scenario, the GW signal will depend on
a set of unknown parameters $\theta$ that we want to
estimate. These can be both extrinsic parameters, such
as the position of the GW source on the sky, and
intrinsic parameters, such as the mass of the component
stars. If we indicate with $\Theta$ the parameter space in
which $\theta$ lives, we can obtain the likelihood of the
data given the generic model $H$ by marginalizing
the likelihood given a particular realization of $\theta$,
thus obtaining the evidence $Z$:

$$P(d|H,I) = \int_\Theta p(\theta|H,I)\,p(d|H,\theta,I)\,\mathrm{d}\theta, \tag{4.3}$$

where we have introduced the prior probability
distribution $p(\theta|H,I)$ for the parameters $\theta$ over the
parameter space. From the evidence, the posterior
distributions for the parameters $\theta$ given the data are easily
obtained using Bayes' theorem:

$$p(\theta|d,H,I) = \frac{p(\theta|H,I)\,p(d|\theta,H,I)}{P(d|H,I)}. \tag{4.4}$$

Given the high dimensionality and the analytical
form of the functions involved, the integral (4.3)
cannot be calculated analytically, and one has to rely on
numerical methods. For our computations, we relied
on the Nested Sampling algorithm [46] in the form
in which it has been implemented for the LIGO
Algorithm Library (LAL) [44] by Veitch and Vecchio [67].
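
To make Eqs. (4.3) and (4.4) concrete, the sketch below evaluates the evidence and the posterior for a one-dimensional toy problem by direct quadrature on a grid. This is deliberately not nested sampling and not the LAL implementation: brute-force integration is only feasible here because the toy parameter space has a single dimension. The flat prior, the Gaussian likelihood and all names are hypothetical.

```python
import math

def evidence_and_posterior(log_like, prior, thetas):
    """Trapezoidal estimate of Z = int prior * likelihood dtheta, Eq. (4.3),
    and the posterior density on the same grid, Eq. (4.4)."""
    integrand = [prior(t) * math.exp(log_like(t)) for t in thetas]
    z = sum(0.5 * (integrand[i] + integrand[i + 1]) * (thetas[i + 1] - thetas[i])
            for i in range(len(thetas) - 1))
    posterior = [v / z for v in integrand]
    return z, posterior

# Toy problem: flat prior on [0, 10], Gaussian likelihood centred on 4 with width 0.5.
lo, hi, n = 0.0, 10.0, 2001
thetas = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
prior = lambda t: 1.0 / (hi - lo)
log_like = lambda t: -0.5 * ((t - 4.0) / 0.5) ** 2

z, post = evidence_and_posterior(log_like, prior, thetas)
theta_map = thetas[post.index(max(post))]   # location of the posterior maximum
```

For this toy case the evidence can be compared with the closed-form value 0.1 x 0.5 x sqrt(2 pi); for the nine-parameter problem treated in this paper a grid of this kind would be computationally prohibitive, which is why a sampling algorithm is used instead.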

In what follows, we will consider two hypotheses: (i)
$H_N$ will be the hypothesis according to which the data
consist solely of noise; (ii) $H_S$ will be the hypothesis
that the data consist of noise plus a GW:

$$H_N \rightarrow d(f) = n(f), \tag{4.5}$$
$$H_S \rightarrow d(f) = n(f) + s(f,\theta), \tag{4.6}$$

where we have made explicit the signal dependence
on the unknown parameter vector $\theta$. If we assume
that the noise in the IFO is stationary and Gaussian,
the likelihood of the data for the two models can
be written as:

$$p(d|\theta,H_N,I) \propto e^{-(d(f)|d(f))/2}, \tag{4.7}$$
$$p(d|\theta,H_S,I) \propto e^{-(d(f)-h(f,\theta)\,|\,d(f)-h(f,\theta))/2}, \tag{4.8}$$

where $h(f,\theta)$ is the GW signal, and we have defined
a noise-weighted inner product:

$$(a(f)|b(f)) = 2\int_0^\infty \frac{a(f)\,b^*(f) + a^*(f)\,b(f)}{S(f)}\,\mathrm{d}f.$$


Once the analysis is done for a given data stream, 
one is provided with two pieces of information: 

• The Bayes factor between the models $H_S$ and
$H_N$ (BSN, for Bayes Signal vs Noise), which tells
how confident we are that there is a signal buried
in the noise.

• The posterior distributions for the unknown
parameters on which the signal (if present) depends,
which allow estimates of the physical and
extrinsic parameters of the GW source.

The method is easily generalized to the case where a
coherent analysis is being performed, using a network
of several IFOs. If we indicate with $d^{(J)}(f)$ the data
stream in the J-th detector, the likelihood of having a
signal or only noise in the J-th detector will be exactly
the same as in Eqs. (4.7) and (4.8), with $d \rightarrow d^{(J)}$. If
the detectors are far enough apart that the noise in
one is not correlated with the noises in the others, the
likelihood for each IFO is statistically independent of
the likelihood for the other instruments, and a joint
likelihood can be built by multiplying the single-IFO
expressions:

$$p(d|\theta,H_k,I) = \prod_J p(d^{(J)}|\theta,H_k,I), \tag{4.9}$$

with $k = N$ or $k = S$. Eq. (4.9) can be used to
calculate the network evidence and perform a coherent
analysis.
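
A discretised version of the noise-weighted inner product and of the Gaussian log-likelihoods (4.7)-(4.9) can be sketched as follows. This is an illustration under stated assumptions, not the LAL code: the inner product is taken with the normalisation 2 int (a b* + a* b)/S df, the integral is replaced by a sum over frequency bins of width df, the proportionality constants of Eqs. (4.7) and (4.8) are dropped, and all function names and data values are hypothetical.

```python
def inner_product(a, b, S, df):
    """Discretised (a|b) = 2 * int [a b* + a* b] / S df over positive frequencies."""
    return 2.0 * df * sum((ai * bi.conjugate() + ai.conjugate() * bi).real / Si
                          for ai, bi, Si in zip(a, b, S))

def log_likelihood_noise(d, S, df):
    """log of Eq. (4.7), up to an additive constant."""
    return -0.5 * inner_product(d, d, S, df)

def log_likelihood_signal(d, h, S, df):
    """log of Eq. (4.8), up to an additive constant."""
    r = [di - hi for di, hi in zip(d, h)]
    return -0.5 * inner_product(r, r, S, df)

def network_log_likelihood(data, templates, psds, df):
    """log of Eq. (4.9): the product over detectors becomes a sum of logs."""
    return sum(log_likelihood_signal(d, h, S, df)
               for d, h, S in zip(data, templates, psds))

# Hypothetical two-bin data in two detectors with identical PSDs.
d = [1 + 1j, 2 + 0j]
S = [1.0, 2.0]
zero = [0j, 0j]
ll_noise = log_likelihood_noise(d, S, 1.0)
ll_perfect = network_log_likelihood([d, d], [d, d], [S, S], 1.0)
ll_empty = network_log_likelihood([d, d], [zero, zero], [S, S], 1.0)
```

A template that matches the data exactly gives a vanishing residual and hence the maximum log-likelihood, while a zero template reproduces the noise-only value in each detector.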



Footnote: The assumption of stationary, Gaussian noise is not true in
general, as the noise in the IFOs is a combination of smaller
Gaussian fluctuations and larger non-Gaussian outliers ("glitches"
in the data). The use of coincidence requirements between
different sites and a whole set of data quality and veto
procedures helps reduce the number of glitches [74-76]. New
techniques are being developed to deal with residual
non-Gaussianity [78]. For simplicity, in this work we will assume
that the candidate events which survive all of these checks are
buried in Gaussian noise.

V. METHOD

A. Analysis and Noise Model 

We have tested the effects of CE on PE using soft- 
ware injections, i.e. artificially adding signals of known 
shape into simulated noise, for a network consisting of 
the two advanced versions of the LIGO and Virgo in- 
struments. We have used the analytical expressions 
for the noise spectral densities as coded in LAL [44] . 
The square root of S(f) for advanced LIGO and Virgo 
is shown in Fig. 2.

Figure 2. (color online) The high-power, zero-detuning
noise curve for Advanced LIGO (red continuous line),
and the BNS-optimized Advanced Virgo noise curve (blue
dashed).

To be more precise, for each IFO, a GW signal $s(f)$
is added to a stream of noise generated using the
design noise PSD for that IFO, $S^{(J)}(f)$, to form the
data vectors

$$d_e^{(J)}(f) = n_e^{(J)}(f) + s_e^{(J)}(f), \tag{5.1}$$

that are combined to form a joint likelihood, Eq. (4.9),
which is evaluated by the Bayesian pipeline. The
subscript $e$ indicates that the transfer function used to
create the stream is the exact one, $R_e(f)$. The final
outcomes of this analysis will be the $\mathrm{BSN}_e$
(logarithmic Bayes factor of the signal hypothesis vs the noise
hypothesis) and the posterior distributions of all the
components of $\theta$, from which the mean $\bar\theta^a_e$, standard
deviation $\Delta\theta^a_e$, as well as the median and higher
moments of the distribution for the parameter $\theta^a$ can be
calculated.

Once the exact analysis was completed, we proceeded
with a similar analysis in which we artificially
introduced calibration errors on signal and on noise as in
Eqs. (3.5) and (3.6). We then compared the BSN and the
posterior distributions obtained from our pipeline in
the two cases. We kept fixed all relevant parameters
of the injection and of the noise generation. The only
difference between the two datasets is the presence
of calibration errors in one of them.

In the next few subsections we will discuss in detail
which GW model waveforms have been used and how
the calibration error curves have been generated.

B. Waveforms and parameter space

When software injections are used to test a
parameter estimation pipeline, there are three major factors
to take into account: (i) the signal being injected,
(ii) the waveform used to recover the signal (known
as template), and (iii) the noise added to the signal.
The noise model we employed has been described in
Sec. V A, hence in this section we will proceed to the
description of (i) and (ii).

The waveform models we used for injections belong
to the Effective One Body (EOB) family gHHSS].

Without entering into details, which can be found
in the references above, the main idea behind the EOB
approach is to treat the two-body problem as an
effective one-body problem, as if a mass equal to the
reduced mass of the system were moving in some
effective space-time metric [50]. The EOB's main
ingredient is the effective Hamiltonian, from which the
evolution of the radial and angular coordinates, as well
as their momenta, can be calculated using Hamilton's
equations. This allows one to write the GW signal, as a
function of the reduced time $\hat t = t/M$ ($M$ being the
total mass of the binary system), as:

$$h(\hat t) = v_\omega^2(\hat t)\,\cos(\varphi(\hat t)), \tag{5.2}$$

where $v_\omega$ is a power of the angular velocity, obtained
by differentiating the phase with respect to the reduced
time $\hat t$, and $\varphi(\hat t)$ is twice the orbital phase.

It is important to note that using a template family
which is different from the injected signal's may
introduce a bias in the recovered posteriors for the
parameters [38]. However, let us remember that in this work
we are not interested in the absolute performance of
the code, or in the match between the injected and
recovered parameters. What we want to measure,
instead, are the effects of CEs, i.e. how much the
posteriors are affected by the presence of CEs. Now, as we
are dealing with small errors, it makes sense to assume
that even if a bias were introduced, it would be very
similar while recovering $s_e(f)$ or $s_m(f)$, and would
become negligible when the difference $\bar\theta^a_m - \bar\theta^a_e$ is
taken, which we use to quantify the shift introduced
by the CE. With this in mind, we have chosen to use
a frequency-domain template, the TaylorF2 discussed
below, because it is known analytically, and no
differential equations have to be solved, so the
performance of the code is greatly improved compared to
more sophisticated models.

The TaylorF2 waveform [57] is calculated starting 
from the time-domain Post-Newtonian (PN) approxi- 
mation of the signal: 



$$h(t) = v^2(t) \cos\left(\varphi(t)\right) \qquad (5.3)$$



which looks identical to Eq. (5.2). The difference is that the amplitude and (twice) the orbital phase are now calculated starting from PN expansions of the energy flux and luminosity, under the assumption that the adiabatic approximation holds, and are known functions of the system's parameters (see [58] and references therein).

The Fourier transform of Eq. (5.3) can be calculated analytically using the so-called stationary phase approximation [59], which consists in expanding the phase of the signal around its stationary point. The final result is:



$$\tilde{h}(f) = \frac{Q(\theta, \phi)\, \mathcal{M}^{5/6}}{D}\, f^{-7/6}\, e^{i\Psi(f)} \qquad (5.4)$$



where the phase is given at 3.5 PN order by:

$$\Psi(f) = 2\pi f t_0 + \phi_0 - \frac{\pi}{4} + \sum_{k=0}^{7} \alpha_k v^k \qquad (5.5)$$

and $v = (\pi M f)^{1/3}$. The coefficients $\alpha_k$, which depend on the total mass and on the symmetric mass ratio, can be found in [60, 61]. The function $Q(\theta, \phi)$ depends on the coordinates of the source in the detector frame. When several IFOs are used to perform a coherent analysis, one has to use a common frame, and the functions $Q$ then depend both on the spherical coordinates of the source in the common frame and on the Euler angles that rotate the detector frame to the common frame [41].
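As a rough illustration of the expansion parameter $v = (\pi M f)^{1/3}$, the following sketch evaluates it in SI units, restoring the factors of $G$ and $c$ that are set to one in the text. The function name `pn_velocity` and the rounded constants are assumptions made for this example and are not taken from the analysis code.

```python
import math

# Rounded physical constants (assumed values, SI units)
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
M_SUN = 1.989e30   # solar mass, kg


def pn_velocity(total_mass_msun, f_hz):
    """Dimensionless PN parameter v = (pi G M f / c^3)^(1/3)."""
    # Total mass expressed in seconds (geometric units, G = c = 1)
    mass_seconds = G * total_mass_msun * M_SUN / C**3
    return (math.pi * mass_seconds * f_hz) ** (1.0 / 3.0)
```

For a total mass of 2.8 solar masses at 40 Hz this gives $v \approx 0.12$, which shows that the expansion parameter is small at the low-frequency end of the band and grows as $f^{1/3}$.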

The signal emitted by a binary system with zero eccentricity$^4$ and nonspinning components depends on nine parameters:

• A reference time (usually the detection time, or the coalescence time) and the phase the waveform had at that time: $t_0$ and $\phi_0$.

• The total mass $M = m_1 + m_2$ and the symmetric mass ratio $\eta = \frac{m_1 m_2}{(m_1 + m_2)^2}$. The chirp mass $\mathcal{M} = \eta^{3/5} M$ is often used instead of the total mass, as it is generally the best-determined variable.

• The luminosity distance of the system, $D$.

• The polarization angle, $\psi$ [63].

• The angle between the line of sight and the orbital angular momentum of the system, $\iota$.

• The coordinates of the source in the common frame, right ascension (RA) and declination (dec).

4 By the time the system's frequency enters the LIGO-Virgo bandwidth, most of the eccentricity will have been radiated away [34], which is why it is usually neglected in the LIGO-Virgo literature.
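The mass parameters listed above are related by simple algebra. A minimal sketch, with hypothetical function names chosen for this example, is:

```python
def symmetric_mass_ratio(m1, m2):
    """eta = m1 * m2 / (m1 + m2)^2; equals 0.25 for equal masses."""
    return m1 * m2 / (m1 + m2) ** 2


def chirp_mass(m1, m2):
    """Chirp mass eta^(3/5) * M, in the same units as m1 and m2."""
    return symmetric_mass_ratio(m1, m2) ** 0.6 * (m1 + m2)
```

For two 1.4 solar-mass neutron stars this gives $\eta = 0.25$ and a chirp mass of about 1.22 solar masses.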

The injections were collected in three catalogs, each one representative of a different kind of binary system, composed of two neutron stars (BNS), two black holes (BBH), or a neutron star and a black hole (BHNS). We will denote those catalogs as $S_j$, with $j$ = BNS, BBH, BHNS. We have assumed that a NS has a mass in the range $[1.4, 2.3]\,M_\odot$ and a BH in the range $[9.0, 11.0]\,M_\odot$. While there are scientific reasons to believe that the mass of a NS is in that range [52], for BHs the range of allowed masses is much broader, going from a few solar masses up to thousands of solar masses for the black holes in the center of the galaxies. We have chosen a range centered around $10\,M_\odot$, as that is the value most often used in the GW data-analysis literature. For each catalog, the distances of the signals were randomly drawn from ranges chosen in such a way that the corresponding SNRs would have values like those we expect from detections with the advanced interferometers. The mass ranges for the two objects, and the distance ranges, for binary systems in the classes above are given in Table I.





        $m_1$                   $m_2$                   $D$
BNS     $[1.4, 2.3]\,M_\odot$   $[1.4, 2.3]\,M_\odot$   [150, 220] Mpc
BBH     $[9.0, 11.0]\,M_\odot$  $[9.0, 11.0]\,M_\odot$  [700, 1000] Mpc
BHNS    $[1.4, 2.3]\,M_\odot$   $[9.0, 11.0]\,M_\odot$  [300, 500] Mpc

Table I. Mass and distance ranges for the systems considered.

Each catalog was filled with 250 signals, whose masses and distances were generated by sampling uniform distributions on the intervals indicated in Table I. The other parameters, namely the sky positions of the sources and the polarization and inclination angles, were generated by sampling uniform distributions on the 2-sphere.

It is worth noticing that the only things that change when going from the $i$-th event of one catalog to the $i$-th event of another catalog are the masses and the distance, while the other parameters are the same. This implies that we can use this work to quantify the effects of CEs on signals having comparable masses but different positions, polarizations, inclinations and distances (by analyzing each catalog), and the effects on signals having the same positions, polarizations and inclinations but different masses (by comparing one catalog with the others).
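The catalog construction described above can be sketched as follows. This is an illustration under stated assumptions, not the code used for the analysis: the function names, the dictionary layout, the polarization range $[0, \pi)$ and the choice of sampling uniformly in $\sin(\mathrm{dec})$ and $\cos\iota$ are all assumptions made here to realize "uniform on the 2-sphere". The angles are drawn once and shared by the three catalogs, so that the $i$-th events differ only in masses and distance.

```python
import math
import random

# (m1 range, m2 range) in solar masses and distance range in Mpc, from Table I
RANGES = {
    "BNS": ((1.4, 2.3), (1.4, 2.3), (150.0, 220.0)),
    "BBH": ((9.0, 11.0), (9.0, 11.0), (700.0, 1000.0)),
    "BHNS": ((1.4, 2.3), (9.0, 11.0), (300.0, 500.0)),
}


def draw_angles(rng):
    """Isotropic sky position and orientation (assumed conventions)."""
    return {
        "ra": rng.uniform(0.0, 2.0 * math.pi),
        "dec": math.asin(rng.uniform(-1.0, 1.0)),   # uniform in sin(dec)
        "iota": math.acos(rng.uniform(-1.0, 1.0)),  # uniform in cos(iota)
        "psi": rng.uniform(0.0, math.pi),
    }


def make_catalogs(n=250, seed=0):
    """Return {"BNS": [...], "BBH": [...], "BHNS": [...]} with n events each."""
    rng = random.Random(seed)
    # Angles are drawn once and reused, so event i has the same
    # position and orientation in every catalog.
    angles = [draw_angles(rng) for _ in range(n)]
    catalogs = {}
    for name, (m1_range, m2_range, d_range) in RANGES.items():
        catalogs[name] = [
            dict(a,
                 m1=rng.uniform(*m1_range),
                 m2=rng.uniform(*m2_range),
                 distance=rng.uniform(*d_range))
            for a in angles
        ]
    return catalogs
```

Calling `make_catalogs()` returns three lists of 250 parameter dictionaries; comparing entry `i` across the lists isolates the dependence on masses and distance.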



C. Generating calibration errors 

It is a reasonable assumption that, at the beginning of the advanced-detector era, the errors in the calibration process will not be much different from what they were during the last part of the initial-detector era [71, 72].

In order to have a good statistical sample, and to take into account possible slow time variations due to $\alpha(t)$, we have generated 10 different error curves for each IFO, for both phase and amplitude.

Each of these curves was created using the following method:

• Read the typical width of the 1-sigma calibration error curves during the last stages of the initial-detector era.

• Draw 15 points in frequency space, uniformly in $\log f$, from Gaussian distributions with zero (one) mean for the phase (amplitude) uncertainties.

• Fit these points with a polynomial of degree 7 to obtain a smooth parametrized curve.
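The three steps above can be sketched in a few lines. The version below is only an illustration under several assumptions that are not stated in the text: the 15 nodes are taken equally spaced in $\log f$, a single width `sigma` is used over the whole 40-2000 Hz band, and the polynomial is fitted in a rescaled variable $x \in [-1, 1]$ with a plain least-squares solver. All function and parameter names are hypothetical.

```python
import math
import random


def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns the coefficients ordered from the constant term upwards.
    """
    n = degree + 1
    a = [[sum(xi ** (j + k) for xi in x) for k in range(n)] for j in range(n)]
    b = [sum(yi * xi ** j for xi, yi in zip(x, y)) for j in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for k in range(col, n):
                a[r][k] -= factor * a[col][k]
            b[r] -= factor * b[col]
    # Back substitution
    coeffs = [0.0] * n
    for r in reversed(range(n)):
        s = sum(a[r][k] * coeffs[k] for k in range(r + 1, n))
        coeffs[r] = (b[r] - s) / a[r][r]
    return coeffs


def make_error_curve(sigma, mean, f_min=40.0, f_max=2000.0,
                     n_points=15, degree=7, seed=0):
    """Return a smooth error curve f -> value for one random realization.

    Use mean=1 for amplitude errors and mean=0 for phase errors.
    """
    rng = random.Random(seed)
    log_lo, log_hi = math.log(f_min), math.log(f_max)
    # Nodes equally spaced in log f, mapped to [-1, 1] (assumed choice)
    xs = [-1.0 + 2.0 * i / (n_points - 1) for i in range(n_points)]
    ys = [rng.gauss(mean, sigma) for _ in xs]
    coeffs = fit_polynomial(xs, ys, degree)

    def curve(f):
        x = -1.0 + 2.0 * (math.log(f) - log_lo) / (log_hi - log_lo)
        return sum(c * x ** k for k, c in enumerate(coeffs))

    return curve
```

For example, `make_error_curve(sigma=0.104, mean=1.0, seed=3)` would give one amplitude realization with a 10.4% width; changing `seed` produces a different curve.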

The aforementioned process was repeated using different seeds for the initialization of the random number generator, so as to obtain different curves. An instance of the realizations we generated is shown in Fig. 3. The interested reader is referred to the Appendix, Figs. 13 to 21, for an overview of all the realizations. The values of the widths we have used are given in Table II and refer to the values estimated during the S5 science run for LIGO and the third science run for Virgo [30]. Adopting the LIGO-Virgo conventions, we will use the label L1 for the LIGO instrument in Livingston, H1 for the LIGO detector in Hanford and V1 for Virgo.





        Amplitude errors (%)                   Phase errors (deg)
        40-2000 Hz   2-4 kHz   4-6 kHz         40-2000 Hz   2-4 kHz   4-6 kHz
H1      10.4         15.4      24.2            4.5          4.9       5.8
L1      10.1         11.2      16.3            3.0          1.8       2.0

        Amplitude errors (%)                   Phase errors (deg)
        40-2000 Hz   2-4 kHz   4-6 kHz         40-500 Hz                     500-2000 Hz                    1-2.8 kHz   2.8-6 kHz
V1      10.0         10.0      20.0            $2.29 + 2.87\cdot10^{-3} f$   $0.5729 + 6.3\cdot10^{-3} f$   6.87        $2.53\cdot10^{-3} f$

Table II. The widths used for the error curves generation. The phase error width for Virgo depends on the frequency $f$ [30].



We will indicate with $\delta A_i/A_i$ and $\delta\phi_i$ the $i$-th realization of the amplitude and phase errors. Note that drawing the points uniformly in $\log f$ is equivalent to assuming that there is a correlation length between the errors at different frequencies which increases linearly with the frequency. We will consider different possibilities in future work, even though the consistency of the results we have obtained using the various curves in this work (see Sec. VI C below) suggests that the results are not strongly dependent on the exact shape of the calibration error curve.



VI. RESULTS 

A. Effects on Parameter Estimation 

Because each parameter can vary over a different range, we have normalized the difference in the means or medians of the parameters inferred from runs with and without calibration errors by their standard deviation. More precisely, if $\theta^\alpha_e$ and $\Delta\theta^\alpha_e$ are the median and standard deviation of the parameter $\theta^\alpha$ we would measure for a given signal if we knew the exact transfer function, while $\theta^\alpha_m$ is the median we measure when CEs are present, we can build the quantity:

$$\Sigma^\alpha \equiv \frac{\theta^\alpha_m - \theta^\alpha_e}{\Delta\theta^\alpha_e} \qquad (6.1)$$

the meaning of which is clear: it measures the shift introduced in the estimate of $\theta^\alpha$ by the CEs, in units of the standard deviation calculated from the probability distribution for the same parameter in the absence of CEs. For each injection, say the $i$-th, in the catalog $S_j$ we can calculate the quantity (6.1):

$$\Sigma^\alpha_i \equiv \frac{\theta^\alpha_{m,i} - \theta^\alpha_{e,i}}{\Delta\theta^\alpha_{e,i}}, \qquad i = 1, \ldots, 250 \qquad (6.2)$$

where $\theta^\alpha_i$ is the median for the parameter $\theta^\alpha$ in the $i$-th injection. We compute the distribution of this quantity over all of the injections in the catalog, and for all the parameters of the model waveform. The resulting distributions generally look similar to Fig. 4, which shows the histogram of $\Sigma$ for the chirp mass $\mathcal{M}$, measured using the BHNS catalog$^5$ and the first CE realization.

Figure 3. (color online) The first CE realization for the amplitude (top) and phase (bottom).

Note that the distribution for $\Sigma_\mathcal{M}$ looks quite symmetric and well centered around zero, meaning that there is no net bias introduced by CEs; instead, some of the injections in the catalog acquire a positive bias and others a negative one. We found that this behavior is common to all parameters except for the distance. The reason is easy to understand: with the other parameters fixed, the distance is inversely proportional to the amplitude of the signal, and is therefore directly affected by the amplitude errors of the transfer function. As an example, in the same CE realization, Fig. 3, the amplitude errors are positive for the three IFOs. The over-estimated amplitudes result in an under-estimate of the distance, so the source is inferred to be closer than in the absence of CEs, Fig. 5.

5 The results are similar for the three catalogs. To avoid having too many figures, we have chosen to show plots only for the BHNS catalog. It is understood that one would get very similar plots for the other two catalogs.

Figure 4. (color online) The distribution of $\Sigma_\mathcal{M}$ for the signals in the BHNS catalog, using the first CE realization. The vertical blue line corresponds to a null shift.

Figure 5. The distribution of $\Sigma_D$ for the signals in the BHNS catalog, using the first CE realization. The vertical blue line corresponds to a null shift.

As a summary of our results, we report the mean $\bar{\Sigma}$ and standard deviation $\Delta\Sigma$ of the distribution for $\Sigma^\alpha$, together with the median and the 5th and 95th percentiles, for each parameter and each catalog, averaged over the 10 CE realizations. It is important to remember that the $\Sigma$s represent the effect of systematic errors and are not normally distributed. In particular, an interval of width $2\Delta\Sigma$ does not have to contain ~66% of the results. The results are summarized in Tables III, IV and V.
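The normalized shift and the summary statistics quoted in the tables can be computed along the following lines. This is a sketch with hypothetical function names; the linear-interpolation percentile and the use of the sample standard deviation are choices made for this example, since the text does not specify them.

```python
import statistics


def normalized_shifts(median_exact, std_exact, median_measured):
    """Shift of the median, in units of the exact-run standard deviation."""
    return [(m - e) / s
            for e, s, m in zip(median_exact, std_exact, median_measured)]


def percentile(sorted_values, q):
    """Percentile with linear interpolation; q is in [0, 100]."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def summarize(shifts):
    """Mean, standard deviation, median, 5th and 95th percentiles."""
    s = sorted(shifts)
    return {
        "mean": statistics.mean(s),
        "std": statistics.stdev(s),      # sample standard deviation (assumed)
        "median": statistics.median(s),
        "p5": percentile(s, 5),
        "p95": percentile(s, 95),
    }
```

Here each list holds one entry per injection for a single parameter and a single CE realization; averaging the resulting summaries over the 10 realizations would give numbers of the kind reported in the tables.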

The distribution for $\Sigma^\alpha$ has been calculated using only the injections whose network SNR is greater than 8, which we used as a proxy for the sensitivity of GW searches. It is important to note, however, that excluding the injections below this threshold (which are ~20% of the total number) does not affect our analysis in a significant way. On the contrary, those weak signals would produce posterior distributions with large standard deviations, and thus small $\Sigma$s, reducing the spreads of the $\Sigma$s.

It is interesting to check whether the net bias (not weighted by the standard deviation) is a function of the SNR. At first one might think that calibration-induced systematic errors must not depend on the SNR, as is the case, for example, for the bias introduced by using wrong templates [77]. However, in Fig. 6 we show $\Sigma_\mathcal{M}$ for the same signals as in Fig. 4, plotted against SNR. As the random errors decrease with the SNR, the fact that the ratio between the bias and the standard deviation (i.e., $\Sigma_\mathcal{M}$) is not increasing with the SNR implies that the net bias is also decreasing with the SNR. We conjecture that this is due to an important difference between systematic errors induced by theoretical waveform differences and calibration errors: in the latter case, not only the template but also the noise is affected (see Eq. (3.6) and Eq. (11) of [77]).

Finally, we point out that our procedure for estimating the bias as the difference in the medians between posterior samples in the error-free and CE-affected runs includes two effects: a genuine systematic bias and a Monte Carlo sampling fluctuation due to finite sample statistics. The latter will scale as SNR$^{-1}$, and could dominate the estimated bias when the bias induced by calibration errors is very small. Thus, our quoted biases represent a conservative upper limit on CE-induced systematic errors. We will study these issues in a follow-up project.

Figure 6. (color online) $\Sigma_\mathcal{M}$ for the signals in the BHNS catalog, using the first CE realization, plotted against the SNR. The fact that the spread does not increase with the SNR implies that the net bias $\theta^\alpha_m - \theta^\alpha_e$ decreases with the SNR.



The $\Sigma$s have means very close to zero for all the parameters, indicating that, when averaging over many events and the many CE realizations, there are no preferred directions for CE-induced systematic biases in parameter estimates. When it comes to the widths of the $\Sigma$ distributions, we can group the parameters into three different sets:

• For the intrinsic parameters $\eta$ and $\mathcal{M}$, and the distance, the width is of the order of $(1$-$2) \times 10^{-1}$.

• For the arrival time, the position parameters RA and dec, and the inclination, the widths are a few times larger, $\sim (3$-$5) \times 10^{-1}$.

• The polarization and arrival phase have very large standard deviations, so the much smaller spread in their $\Sigma$ is a consequence of their large standard deviations.

$-3.20\cdot10^{-4}$   $-3.87\cdot10^{-1}$   $3.91\cdot10^{-1}$
D

Table III. The mean $\bar{\Sigma}$, standard deviation $\Delta\Sigma$, median, 5th and 95th percentiles of $\Sigma$ for all the parameters using the BNS catalog. These numbers are obtained by averaging over the ten CE realizations. All the quantities are pure numbers (remember the definition of $\Sigma$, Eq. (6.2)).

Table IV. Same as Table III but using the BBH catalog.
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Table V. Same as Table [TTTl but using the BHNS catalog 
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The averaged numbers we gave in Tables III IV and 
[V] describe the typical scenario, as they were obtained 
averaging among the 10 CE curves, reducing the im- 
pact of CE curves which had produced the largest 
spreads. An alternative representation is shown in 
Fig. [7j where we plot the median of £ for each pa- 
rameter (except ip and </>o, as we have seen they are 
always estimated with huge errors) averaged over the 
10 CE realizations, with error bars whose min and 
max values are the worst 5th and 95th percentiles en- 
countered in the 10 CE runs. These error bars yield 
a conservative estimate of the impact of calibration 
errors when the actual CE realization and the statis- 
tics of the injection parameters line up to produce the 
largest shifts in parameter estimation. 
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Figure 7. The median of £ averaged among the 10 CE 
realizations. The lower end of the error bars corresponds 
to the lowest 5th percentile encountered in the various CE 
runs, while the upper end corresponds to the highest 95th 
percentile. We do not show ijj and <f>o as those parameters 
are very poorly estimated. The upper panel refers to the 
BNS catalog, the middle one to the BHNS and the bottom 
one to the BBH catalog. 

Apart from the 1D results we have reported, it is interesting to verify how the confidence in our knowledge of the position of the source in the sky changes because of the CEs, as this will capture the joint variation of RA and dec, taking into account their correlation. Let us call $M_e = (\mathrm{dec}_e, \mathrm{RA}_e)$ the point on the unit sphere whose spherical coordinates are given by the median values of RA and dec calculated in the exact run. Using the line element of a 2D sphere, we can write the size of the random error in the estimation of $M_e$ as

$$\epsilon_e^2 \equiv \Delta\mathrm{dec}_e^2 + \sin^2\left(\frac{\pi}{2} - \mathrm{dec}_e\right) \Delta\mathrm{RA}_e^2 .$$

Adding the CEs will similarly yield the median sky location $M_m = (\mathrm{dec}_m, \mathrm{RA}_m)$, and we can measure the distance on the unit sphere between the points $M_e$ and $M_m$:

$$\epsilon_{me}^2 \equiv (\mathrm{dec}_m - \mathrm{dec}_e)^2 + \sin^2\left(\frac{\pi}{2} - \mathrm{dec}_e\right) (\mathrm{RA}_m - \mathrm{RA}_e)^2 .$$

We weight the distance between the exact and measured positions on the unit sphere by the size of the random error box of the exact run:

$$\sigma \equiv \frac{\epsilon_{me}}{\epsilon_e}, \qquad (6.3)$$

with $\sigma = 0$ implying that the shift introduced by the CEs is null, and $\sigma > 1$ that it is larger than the uncertainties due to the noise. In Fig. 8 we show the median of $\sigma$, together with the 5th and 95th percentiles, for all the CEs and the three mass bins.
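The sky-location statistic of Eq. (6.3) reduces to a few lines of arithmetic. The sketch below assumes all angles are in radians and ignores the wrap-around of RA at $2\pi$; the function and argument names are hypothetical.

```python
import math


def sky_shift_sigma(dec_e, ra_e, ddec_e, dra_e, dec_m, ra_m):
    """Ratio of the CE-induced sky shift to the exact-run error box.

    dec_e, ra_e   : median sky position from the exact run
    ddec_e, dra_e : standard deviations of dec and RA in the exact run
    dec_m, ra_m   : median sky position from the run with CEs
    """
    # Metric factor of the 2-sphere at colatitude pi/2 - dec
    weight = math.sin(math.pi / 2.0 - dec_e) ** 2
    eps_e = math.sqrt(ddec_e ** 2 + weight * dra_e ** 2)
    eps_me = math.sqrt((dec_m - dec_e) ** 2 + weight * (ra_m - ra_e) ** 2)
    return eps_me / eps_e
```

A value of 1 means that the medians moved by exactly one error-box width; because the statistic is a ratio, it can be large even when the shift in radians is tiny, as long as the exact-run error box is small.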



Figure 8. The median of $\sigma$ (introduced in the main text) when using the various CE curves (shown on the abscissa, labeled "CE curve") and the three mass bins (from top to bottom: BNS, BHNS, BBH). The error bars show the 5th and 95th percentiles. Note that the ordinate scale varies between the subplots.

It is evident that CE curve 2 leads to average shifts which are much larger than for the other CE curves (the median of $\sigma$ is larger than 0.5 in the three catalogs), and to very large spreads (95th percentile larger than 1.6). Note, however, that we are weighting the distance on the unit sphere by the width of the random error box of the exact run. Thus a large value of $\sigma$ does not imply a large shift in radians. We have indeed verified that some of the signals that end up in the tails of the distribution of $\sigma$ are high-SNR signals, for which $\epsilon_e$ is very small, and so is $\epsilon_{me}$, even though their ratio may be ~2-3.

It is known that in a three-interferometer network, if the position of the source were to be estimated using just time triangulation, there would be a degeneracy corresponding to a reflection of the position with respect to the plane that contains the three IFOs [64, 69]. In reality, amplitude information and correlations with the remaining parameters also affect the sky localization (e.g. by disentangling the plus and cross polarizations) and break this symmetry. In this way, one of the two specular positions can actually be preferred and assigned a higher probability. Perturbations to the phase of the injected signal, like the ones introduced by the calibration phase errors, may change the situation and push our inference towards the reflected position.

We have found three signals (one in the BBH catalog and two in the BHNS catalog) for which adding the CEs leads to the aforementioned behavior. More precisely, in two cases the signal was found in the specular position with respect to the IFOs plane; in the third case it was found in a position belonging to the ring on the sky which yields the same H1-L1 time delay. This is discussed, for example, in [73] for a network made of H1 and L1 only; although we are using three IFOs in this work, for the event in question the SNR in Virgo was 4 times smaller than the SNR in H1 and L1, which explains why the result is similar to that of an H1-L1 network. This phenomenon happened only with a few CE curves (3 out of 10). After a thorough analysis we have concluded that this behavior was not solely due to the addition of CEs but also to the particular noise realizations for those events. In fact, we have rerun the analysis on those signals using 100 different noise realizations, finding that only 8% of the noise streams, in conjunction with the CEs, would lead to the aforementioned large shifts. Considering that these outliers were nine (3 signals times 3 CE curves) out of the initial set of 7500 signals, and that only 8 noise realizations out of 100 produced them, we concluded that the probability of such extreme shifts is ~0.1%, and we did not take them into consideration while writing Tables IV and V.



B. Effects on Bayes factors

The main outcome of the Nested Sampling code is the Bayes' factor, a measure of the confidence in the hypothesis that a signal is buried in the noise. To be more precise, the evidence (Eq. (4.3)), and thus the Bayes' factor, which is the ratio between the evidences of two models (Eq. (4.2)), is a measure of the fit of the data to the model. Being marginalized over all the parameters, it shows the mean match between the model and the data. Because of its huge range of variation, it is usually the log of this quantity which is reported, logBSN. Hence we will quote the natural logarithm of the Bayes' factor, as defined in Eq. (4.2).

In [57] a method was described in which logBSN could be used as a detection statistic. It was shown how, if one assigned equal prior probability to the presence of a signal and to the presence of pure noise, a threshold of BSN ~ 2.8 could be set, such that 99% of the analyses which gave a BSN > 2.8 contained a signal. A more refined estimate, which takes into account our knowledge of the rates with which GWs should be detected, sets this threshold to ~20 [55].

It is then interesting to study, besides the systematics that CEs introduce in the estimated parameters, the effects they might have on the estimation of the Bayes factor, as large shifts may decrease the confidence we assign to a detection. Moreover, comparing the Bayes factors with and without calibration errors is a direct measure of how much worse the fit is overall. We have complemented our analysis by investigating this issue. In Fig. 9 we show, for all the injections in the BHNS catalog, the difference between the average of the measured logBSN over the ten CE realizations and the exact log Bayes factor, $\langle \log\mathrm{BSN}_m \rangle - \log\mathrm{BSN}_e$, where we have indicated with angle brackets the average over the CE realizations, $\langle \log\mathrm{BSN}_m \rangle = \frac{1}{10}\sum_{i=1}^{10} \log\mathrm{BSN}_m^i$, plotted against the optimal SNR$^6$. We also show error bars corresponding to the spread of logBSN$_m$ amongst the CE realizations, and we colored the points according to the logBSN$_e$ of the injections.

It is evident from Fig. 9 that the higher the optimal SNR (and consequently the logBSN) of the injected signal, the bigger the impact of CEs on the logBSN. In fact, a signal with a high SNR will be "clearly" detected by the PE code, and well matched with the right template. In this scenario, the disturbances due to CEs are more visible (i.e. the change in logBSN is larger) than in a low-SNR scenario. When the signal is hardly detected, CEs add only some extra mismatch. In general the effects are very small, the average shift in logBSN over the three mass bins and the 10 CE curves we have considered being 0.9%, with the binary neutron star systems being the most affected (1.8%).

We can then conclude that, if the Bayes factor were used as a complementary piece of data in assessing the confidence of a detection, it would represent a reliable help, being barely affected by calibration errors.

6 As the optimal SNR is unaffected by CEs, Eq. (3.10), we are allowed in Fig. 9 to use a single x axis.

Figure 9. (color online) The difference in log Bayes factor between the exact run and the average of the runs with calibration errors. The values in the colorbar correspond to the BSN$_e$ produced by the injections. Generally, louder signals are affected by larger shifts.
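The quantity plotted in Fig. 9 is an average over CE realizations minus the exact value. A minimal sketch for a single injection follows; the function name is hypothetical, and both the use of the population standard deviation as the "spread" and the definition of the relative shift as a fraction of the exact logBSN are assumptions made for this example.

```python
import statistics


def log_bsn_shift(log_bsn_exact, log_bsn_measured):
    """Average CE-induced shift in the log Bayes factor for one injection.

    log_bsn_exact    : log Bayes factor from the exact run
    log_bsn_measured : log Bayes factors from the CE runs, one per realization
    Returns (shift, spread, relative_shift).
    """
    mean_measured = sum(log_bsn_measured) / len(log_bsn_measured)
    shift = mean_measured - log_bsn_exact
    spread = statistics.pstdev(log_bsn_measured)  # assumed definition of spread
    return shift, spread, shift / log_bsn_exact
```

With an exact logBSN of 100 and CE runs giving 98 and 100, the shift is -1 and the relative shift is -1%, comparable in size to the percent-level averages quoted in the text.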



C. Comparing the CE realizations

The data analyst will not know the exact shape and magnitude of the CE affecting the data; it is then an interesting exercise to study how the effects of the errors vary with the shape of the CE curves.

To study how parameter estimation reacts to the CE curves, we show how the median and standard deviations of the $\Sigma$s of the various parameters vary among the ten CE realizations in the three catalogs. For example, in Fig. 10 we plot the median of $\Sigma_\eta$ (mass ratio) over the injections in the BHNS catalog, together with their standard deviations, for each CE realization (labeled on the x axis).

Figure 10. Mean of $\Sigma_\eta$ with the various CE realizations for the BHNS catalog.

It is quite remarkable that all the CEs give $\Sigma_\eta$ with similar averages, the largest difference being ~0.07. A similar plot is obtained for the chirp mass. In Fig. 11 we show the same plot for RA (note that the y axis [...]). The results of the runs with the various CEs are comparable, but the error bars are generally larger than for the intrinsic parameters, meaning that those parameters are more affected by the calibration errors. Sky localization is most strongly affected by differences in amplitude calibration errors in different interferometers at frequencies where the interferometers are most sensitive. This is particularly true for the Hanford and Livingston interferometers, which are relatively nearby and nearly aligned, meaning that any incoherence in the recovered amplitudes cannot be fit by adjusting the inclination or polarization of the source, and can influence the recovered sky location. Therefore, it is not surprising to see much larger variations for the second and sixth CE realizations, for which the amplitude corrections for H1 and L1 have opposite signs near 100 Hz (see Figs. 13 and 17). Meanwhile, e.g., the fifth CE realization has very comparable amplitude CEs for H1 and L1 at 100 Hz (see Fig. 16), matching up to the small range of normalized systematic biases in RA (see Fig. 11).

Figure 11. Mean of $\Sigma_{\mathrm{RA}}$ with the various CE realizations for the BHNS catalog.

The medians of $\Sigma$ for the distance, Fig. 12, are not centered around zero. This is not unexpected, as we have pointed out earlier that the distance estimation is directly affected by the amplitude errors.



VII. CONCLUSIONS 

In this work we have quantified in a systematic way, for the first time in the literature, the effects of calibration errors on the estimation of parameters of gravitational waves emitted by binary systems with non-spinning components. We have considered three mass bins, and for each bin we have created a catalog with 250 sources, uniformly distributed in the sky. A Bayesian parameter estimation code was run on all the injections of these catalogs, first using the exact transfer function (i.e., without calibration errors), and then after transforming the data with one of the ten calibration error curves we have generated. We have then compared the posterior distributions, as well as the Bayes factors, of the runs where the errors were added with those of the control runs, where no errors were present.

Figure 12. Mean of $\Sigma_D$ with the various CE realizations for the BHNS catalog.

We found that for all the error curves considered, 
the effects are small, the systematic shift introduced 
in the estimated parameters being a fraction of the 
statistical measurement errors. We also considered 
the effect of calibration errors on Bayes factors, finding 
that it is larger for louder injections, but always small 
enough that no signals would be missed because of 
calibration errors by a putative pipeline that would 
rank events by Bayes factors. 

Furthermore, we have found that the different cal- 
ibration error curves we considered yield compatible 
results, implying that the distribution of CE-induced 
shifts in parameter estimates does not strongly depend 
on the exact shape of the CE curves. 

The inclusion of spins in the waveform model will lead to additional complications, and should be the subject of a future investigation.

Appendix A: Error curves

In this section we show nine of the ten calibration error curves. The remaining one was given in the main text, Fig. 3.

Figure 13. The second CE realization for the amplitude (top) and phase (bottom).

Figure 14. The third CE realization for the amplitude (top) and phase (bottom).

Figure 15. The fourth CE realization for the amplitude (top) and phase (bottom).

Figure 16. The fifth CE realization for the amplitude (top) and phase (bottom).

Figure 17. The sixth CE realization for the amplitude (top) and phase (bottom).

Figure 18. The seventh CE realization for the amplitude (top) and phase (bottom).

Figure 19. The eighth CE realization for the amplitude (top) and phase (bottom).

Figure 20. The ninth CE realization for the amplitude (top) and phase (bottom).

Figure 21. The tenth CE realization for the amplitude (top) and phase (bottom).